Skip to content

mp: sort flb_record_accessor by a number of subkeys(#5546)#5590

Merged
edsiper merged 6 commits intofluent:masterfrom
nokute78:fix_5546
Feb 14, 2023
Merged

mp: sort flb_record_accessor by a number of subkeys(#5546)#5590
edsiper merged 6 commits intofluent:masterfrom
nokute78:fix_5546

Conversation

@nokute78
Copy link
Contributor

@nokute78 nokute78 commented Jun 19, 2022

Fixes #5546

out_loki tries to remove key/value using flb_mp_accessor_keys_remove.
The API needs a key lists to indicate which key should be removed.

I tested the API and I found that the list should be sorted if it is passed nested key/value.
However the lists is not sorted at current master. It may not work if user passes nested key/value.

This patch is to fix it.

  • Add test code
  • Add API to get subkey number
  • Insert record_accessor sorted by subkey number

Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

Documentation

  • [N/A] Documentation required for this feature

Backporting

  • [N/A] Backport to latest stable release.

Configuration

[INPUT]
    Name dummy
    dummy {"log":"blahblah","kubernetes":{"pod_name":"nginx", "namespace_name":"default"}}

[OUTPUT]
    Name loki
    auto_kubernetes_labels off
    labels pod=$kubernetes['pod_name']
    remove_keys kubernetes
    port 9200
    Log_Level debug

http_server.go:

package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"strconv"
)

func main() {
	port := 9200
	ps := strconv.Itoa(port)

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		defer r.Body.Close()
		b, err := io.ReadAll(r.Body)
		if err != nil {
			log.Printf("ReadAll error=%s\n", err)
		} else {
			log.Printf("%+v\n", string(b))
		}
		//w.WriteHeader(400)
	})
	fmt.Printf("Listen port=%s\n", ps)
	err := http.ListenAndServe(":"+ps, nil)
	if err != nil {
		log.Printf("ListenAndServe err=%s\n", err)
	}
}

Debug log

$ ../bin/fluent-bit -c a.conf 
Fluent Bit v1.9.5
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2022/06/26 12:37:40] [ info] [fluent bit] version=1.9.5, commit=71e53850ab, pid=40379
[2022/06/26 12:37:40] [ info] [storage] version=1.2.0, type=memory-only, sync=normal, checksum=disabled, max_chunks_up=128
[2022/06/26 12:37:40] [ info] [cmetrics] version=0.3.4
[2022/06/26 12:37:40] [debug] [output:loki:loki.0] remove_mpa size: 2
[2022/06/26 12:37:40] [ info] [output:loki:loki.0] configured, hostname=127.0.0.1:9200
[2022/06/26 12:37:40] [ info] [sp] stream processor started
[2022/06/26 12:37:41] [debug] [output:loki:loki.0] 127.0.0.1:9200, HTTP status=200
[2022/06/26 12:37:42] [debug] [output:loki:loki.0] 127.0.0.1:9200, HTTP status=200
^C[2022/06/26 12:37:42] [engine] caught signal (SIGINT)
[2022/06/26 12:37:42] [ warn] [engine] service will shutdown in max 5 seconds
[2022/06/26 12:37:42] [debug] [output:loki:loki.0] 127.0.0.1:9200, HTTP status=200
[2022/06/26 12:37:43] [ info] [engine] service has stopped (0 pending tasks)

"kubernetes" field is removed.

$ go run http_server.go 
Listen port=9200
2022/06/26 12:37:41 {"streams":[{"stream":{"pod":"nginx"},"values":[["1656214660553074560","{\"log\":\"blahblah\"}"]]}]}
2022/06/26 12:37:42 {"streams":[{"stream":{"pod":"nginx"},"values":[["1656214661552289828","{\"log\":\"blahblah\"}"]]}]}

Valgrind output

$ valgrind --leak-check=full ../bin/fluent-bit -c a.conf 
==40385== Memcheck, a memory error detector
==40385== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==40385== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==40385== Command: ../bin/fluent-bit -c a.conf
==40385== 
Fluent Bit v1.9.5
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2022/06/26 12:38:56] [ info] [fluent bit] version=1.9.5, commit=71e53850ab, pid=40385
[2022/06/26 12:38:56] [ info] [storage] version=1.2.0, type=memory-only, sync=normal, checksum=disabled, max_chunks_up=128
[2022/06/26 12:38:56] [ info] [cmetrics] version=0.3.4
[2022/06/26 12:38:56] [debug] [output:loki:loki.0] remove_mpa size: 2
[2022/06/26 12:38:56] [ info] [output:loki:loki.0] configured, hostname=127.0.0.1:9200
[2022/06/26 12:38:56] [ info] [sp] stream processor started
[2022/06/26 12:38:57] [debug] [output:loki:loki.0] 127.0.0.1:9200, HTTP status=200
[2022/06/26 12:38:58] [debug] [output:loki:loki.0] 127.0.0.1:9200, HTTP status=200
^C[2022/06/26 12:38:58] [engine] caught signal (SIGINT)
[2022/06/26 12:38:58] [ warn] [engine] service will shutdown in max 5 seconds
[2022/06/26 12:38:58] [debug] [output:loki:loki.0] 127.0.0.1:9200, HTTP status=200
[2022/06/26 12:38:59] [ info] [engine] service has stopped (0 pending tasks)
==40385== 
==40385== HEAP SUMMARY:
==40385==     in use at exit: 0 bytes in 0 blocks
==40385==   total heap usage: 1,399 allocs, 1,399 frees, 1,439,096 bytes allocated
==40385== 
==40385== All heap blocks were freed -- no leaks are possible
==40385== 
==40385== For lists of detected and suppressed errors, rerun with: -s
==40385== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@nokute78
Copy link
Contributor Author

Hmm, mk_list API can't add new value to head of the list.
I think it doesn't expect a head is changed.

@nokute78 nokute78 changed the title Fix 5546 mp: sort flb_record_accessor by a number of subkeys(#5546) Jun 19, 2022
@nokute78 nokute78 marked this pull request as ready for review June 26, 2022 03:39
@nokute78
Copy link
Contributor Author

It is ready to review.

@edsiper
Copy link
Member

edsiper commented Jun 27, 2022

@nokute78 pls submit mk_list changes directly to monkey, we will do the merge back.

@nokute78
Copy link
Contributor Author

nokute78 commented Jul 1, 2022

@edsiper I sent a patch monkey/monkey#371.
Could you review it ?

@nokute78
Copy link
Contributor Author

@edsiper @leonardo-albertovich ping.

@nokute78
Copy link
Contributor Author

monkey/monkey#371 was updated.
Unit test was OK using above patch.

$ bin/flb-it-mp
Test count...                                   [ OK ]
(snip)
Test accessor_keys_remove_subkey_key...         
=== ORIGINAL  ===
[0] {"key1"=>"something", "kubernetes"=>[true, false, {"a"=>false, "annotations"=>{"fluentbit.io/tag"=>"thetag", "extra"=>false}}]}
=== FINAL MAP ===
[0] {"key1"=>"something"}
[ OK ]
Test accessor_keys_remove_subkey_keys...        
=== ORIGINAL  ===
[0] {"key1"=>"something", "kubernetes"=>[true, false, {"a"=>false, "annotations"=>{"fluentbit.io/tag"=>"thetag", "extra"=>false}}]}
=== FINAL MAP ===
[0] {"key1"=>"something"}

=== ORIGINAL  ===
[0] {"key1"=>"something", "kubernetes"=>[true, false, {"a"=>false, "annotations"=>{"fluentbit.io/tag"=>"thetag", "extra"=>false}}]}
=== FINAL MAP ===
[0] {"key1"=>"something"}

=== ORIGINAL  ===
[0] {"key1"=>"something", "kubernetes"=>[true, false, {"a"=>false, "annotations"=>{"fluentbit.io/tag"=>"thetag", "extra"=>false}}]}
=== FINAL MAP ===
[0] {"key1"=>"something"}

=== ORIGINAL  ===
[0] {"key1"=>"something", "kubernetes"=>[true, false, {"a"=>false, "annotations"=>{"fluentbit.io/tag"=>"thetag", "extra"=>false}}]}
=== FINAL MAP ===
[0] {"key1"=>"something"}

=== ORIGINAL  ===
[0] {"key1"=>"something", "kubernetes"=>[true, false, {"a"=>false, "annotations"=>{"fluentbit.io/tag"=>"thetag", "extra"=>false}}]}
=== FINAL MAP ===
[0] {"key1"=>"something"}

=== ORIGINAL  ===
[0] {"key1"=>"something", "kubernetes"=>[true, false, {"a"=>false, "annotations"=>{"fluentbit.io/tag"=>"thetag", "extra"=>false}}]}
=== FINAL MAP ===
[0] {"key1"=>"something"}
[ OK ]
SUCCESS: All unit tests have passed.

@DevSusu
Copy link

DevSusu commented Sep 14, 2022

@edsiper @leonardo-albertovich ping.

return -1;
}

return mk_list_size(rp->key->subkeys);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beware of mk_list_size, it iterates the entire list in order to calculate the size each time. At this point I think we might just waste 8 bytes per node and add a counter that's only used in the head node but I haven't properly assessed the impact of it yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad monkey API supports that feature.

However we don't pass a head node pointer to mk_list_del function.
I think we need to seek the head node from a entry pointer or create a new API to pass a head node to decrement list size.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true and the possible side effect could be a lot worse than the original issue, you're right.

@updogliu
Copy link

Any updates or blockers at here?

Signed-off-by: Takahiro Yamashita <nokute78@gmail.com>
Signed-off-by: Takahiro Yamashita <nokute78@gmail.com>
flb_mp_accessor_keys_remove expects incoming lists are sorted.
Since it seeks shallow keys first.
If it seeks nested key at first, it may fail to seek shallow key.

Signed-off-by: Takahiro Yamashita <nokute78@gmail.com>
Signed-off-by: Takahiro Yamashita <nokute78@gmail.com>
@nokute78 nokute78 temporarily deployed to pr October 5, 2022 11:37 Inactive
@nokute78 nokute78 temporarily deployed to pr October 5, 2022 11:37 Inactive
@nokute78
Copy link
Contributor Author

nokute78 commented Oct 5, 2022

I rebased master and modified to use mk_list_add_before.
Ready to review.

@nokute78 nokute78 temporarily deployed to pr October 5, 2022 11:57 Inactive
@DevSusu
Copy link

DevSusu commented Nov 2, 2022

@edsiper @leonardo-albertovich
I'm really happy to see the fluent-bit 2.0 release, and changes about the Loki Plugin https://github.com/fluent/fluent-bit/releases/tag/v2.0.0
with this PR merged, it'll be perfect!

@nokute78
Copy link
Contributor Author

nokute78 commented Dec 3, 2022

@edsiper @leonardo-albertovich Ping. It is ready for review.

@nokute78
Copy link
Contributor Author

nokute78 commented Jan 13, 2023

@edsiper @leonardo-albertovich Ping.
It is ready for reviewing.
Recently user requests to fix the issue again #5546 (comment) .

Copy link
Contributor

@leonardo-albertovich leonardo-albertovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don;'t see anything wrong with the code and considering it fixes a longstanding issue I think it should be merged ASAP.

src/flb_mp.c Outdated
mk_list_add(&ra->_head, &mpa->ra_list);

/* first time */
if (mk_list_size(&mpa->ra_list) == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could skip this block if becaise the mk_list_foreach in insert_by_subkey_num would not satisfy the condition in the for loop when the list is empty so it would skip straight to the mk_list_add fallback right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for reviewing. You are right.

I sent a following commit.
b98c063

struct mk_list _head; /* Head to custom list (only used by flb_mp.h) */
};

int flb_ra_subkey_num(struct flb_record_accessor *ra);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you replace num with count everywhere? that would be much clearer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for reviewing.
I pushed following commit.
c9ece83

int ret = -1;
int tmp;

if (ra == NULL) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe returning 0 in this case would be better even if that would make it undistinguishable from legitimate empty ra it would be it less error prone (ie. someone using a size_t without paying attention) since at the moment the result is not being checked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer returning negative value.

In a future, the result can be checked.

@lecaros lecaros added this to the Fluent Bit 2.0.10 milestone Feb 7, 2023
Signed-off-by: Takahiro Yamashita <nokute78@gmail.com>
@nokute78 nokute78 temporarily deployed to pr February 10, 2023 23:53 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr February 10, 2023 23:53 — with GitHub Actions Inactive
Review comment:
the mk_list_foreach in insert_by_subkey_num would not satisfy the condition in the for loop when the list is empty so it would skip straight to the mk_list_add fallback

Signed-off-by: Takahiro Yamashita <nokute78@gmail.com>
@nokute78 nokute78 temporarily deployed to pr February 11, 2023 00:08 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr February 11, 2023 00:11 — with GitHub Actions Inactive
@nokute78 nokute78 temporarily deployed to pr February 11, 2023 00:11 — with GitHub Actions Inactive
@nokute78
Copy link
Contributor Author

nokute78 commented Feb 11, 2023

I updated.
c9ece83
b98c063

CI failed, but I think it is not related this PR.
https://github.com/fluent/fluent-bit/actions/runs/4148684951/jobs/7176918184#step:2:41

Get:25 https://packages.microsoft.com/ubuntu/20.04/prod focal/main armhf Packages [12.3 kB]
Fetched 11.3 MB in 2s (5398 kB/s)
Reading package lists...
E: Failed to fetch https://packages.microsoft.com/ubuntu/20.04/prod/dists/focal/main/binary-amd64/Packages.gz  File has unexpected size (173872 != 173717). Mirror sync in progress? [IP: 13.82.67.141 443]
   Hashes of expected file:
    - Filesize:173717 [weak]
    - SHA512:e787843d9c76dd8fd521e59acb0894d4711a1c17715b20b70369a3954f17188e8bdc0b21628f259dcc3870153de65b3d0d3111bf567d5875a5ed1520a26474aa
    - SHA256:8056429dc12e244a28498fffe18243d5a45f524ea9ed826913dff60750f1d485
   Release file created at: Fri, 10 Feb 2023 23:59:47 +0000
E: Some index files failed to download. They have been ignored, or old ones used instead.

@nokute78 nokute78 temporarily deployed to pr February 11, 2023 00:29 — with GitHub Actions Inactive
@patrick-stephens patrick-stephens temporarily deployed to integration February 14, 2023 09:56 — with GitHub Actions Inactive
@patrick-stephens patrick-stephens temporarily deployed to integration February 14, 2023 10:03 — with GitHub Actions Inactive
@edsiper edsiper merged commit 3f9875d into fluent:master Feb 14, 2023
@edsiper
Copy link
Member

edsiper commented Feb 14, 2023

thanks everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Loki Output - remove_keys skipped if used in labels

7 participants